79 research outputs found

    Gauge Invariant Framework for Shape Analysis of Surfaces

    This paper describes a novel framework for computing geodesic paths in shape spaces of spherical surfaces under an elastic Riemannian metric. The novelty lies in defining this Riemannian metric directly on the quotient (shape) space, rather than inheriting it from the pre-shape space, and using it to formulate a path energy that measures only the normal components of velocities along the path. In other words, this paper defines and solves for geodesics directly on the shape space and avoids complications resulting from the quotient operation. The framework is invariant to arbitrary parameterizations of surfaces along paths, a property termed gauge invariance. Additionally, this paper links the elastic metrics used in the computer science literature with those used in the mathematical literature, and provides a geometrical interpretation of the terms involved. Examples using real and simulated 3D objects are provided to help illustrate the main ideas. Comment: 15 pages, 11 figures, to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence in a better resolution

    ConViViT -- A Deep Neural Network Combining Convolutions and Factorized Self-Attention for Human Activity Recognition

    The Transformer architecture has gained significant popularity in computer vision tasks due to its capacity to generalize and capture long-range dependencies. This characteristic makes it well-suited for generating spatiotemporal tokens from videos. On the other hand, convolutions serve as the fundamental backbone for processing images and videos, as they efficiently aggregate information within small local neighborhoods to create spatial tokens that describe the spatial dimension of a video. While both CNN-based architectures and pure transformer architectures are extensively studied and utilized by researchers, the effective combination of these two backbones has not received comparable attention in the field of activity recognition. In this research, we propose a novel approach that leverages the strengths of both CNNs and Transformers in a hybrid architecture for performing activity recognition using RGB videos. Specifically, we suggest employing a CNN to enhance the video representation by generating a 128-channel video that effectively separates the human performing the activity from the background. Subsequently, the output of the CNN module is fed into a transformer to extract spatiotemporal tokens, which are then used for classification. Our architecture achieves new state-of-the-art results of 90.05%, 99.6%, and 95.09% on HMDB51, UCF101, and ETRI-Activity3D, respectively.

    3D Face Recognition under Expressions, Occlusions, and Pose Variations


    A DYNAMIC GEOMETRY-BASED APPROACH FOR 4D FACIAL EXPRESSIONS RECOGNITION

    In this paper we present a fully automatic approach for identity-independent facial expression recognition from 3D video sequences. Towards that goal, we propose a novel approach to extract a scalar field that represents the deformations between faces conveying different expressions. We extract relevant features from this deformation field using LDA and then train a dynamic model on these features using an HMM. Experiments conducted on the BU-4DFE dataset following state-of-the-art settings show the effectiveness of the proposed approach.

    3D Dynamic Expression Recognition Based on a Novel Deformation Vector Field and Random Forest

    This paper proposes a new method for facial motion extraction to represent, learn, and recognize observed expressions from 4D video sequences. The approach, called Deformation Vector Field (DVF), is based on Riemannian facial shape analysis and densely captures dynamic information from the entire face. The resulting temporal vector field is used to build the feature vector for expression recognition from 3D dynamic faces. By applying an LDA-based feature space transformation for dimensionality reduction, followed by a multiclass Random Forest learning algorithm, the proposed approach achieves a 93% average recognition rate on the BU-4DFE database and outperforms state-of-the-art approaches.

    Enhancing Gender Classification by Combining 3D and 2D Face Modalities

    Shape and texture provide different modalities in face-based gender classification. Although extensive work has been reported in the literature, most of it addresses the shape or texture modality individually. Only a few studies concern their combination, and to the best of our knowledge, no work considers the combination with the 3D face surface. In our work, we investigate the combination of shape and texture modalities for gender classification, using both the combination of range images and gray images, and the combination of 3D meshes and gray images. In 10-fold subject-independent cross-validation with Random Forest on the FRGC-2.0 dataset, we achieved a correct gender classification rate of 93.27% ± 5.16, which outperforms each individual modality and is comparable to the state of the art. Results confirm that shape and texture modalities are complementary, and that their combination enhances the performance of face-based gender classification.

    Fusion d'Experts pour une Biométrie Faciale 3D Robuste aux Déformations [Fusion of Experts for 3D Face Biometrics Robust to Deformations]

    Session "Posters". In this article we study the contribution of the three-dimensional geometry of the face to the recognition of individuals. The main contribution is to combine several 3D face biometric experts (matchers) in order to achieve better performance than each expert achieves individually, particularly in the presence of expressions. The experts used are: (E1) elastic radial curves; (E2) MS-eLBP, an extended multi-scale version of the LBP operator; and (E3) the non-rigid TPS registration algorithm; in addition to a reference expert (Eref), the well-known rigid registration algorithm ICP. Taking advantage of the complementarity of the experts, the approach achieves an identification rate exceeding 99% in the presence of facial expressions on the FRGCv2 database. A comparative study with the state of the art confirms the choice and the value of combining several experts to achieve better performance.

    Calcul statistique sur les variétés de forme pour l'analyse et la reconnaissance de visage 3D [Statistical Computing on Shape Manifolds for 3D Face Analysis and Recognition]

    In this thesis, we propose a unified Riemannian framework for comparing, deforming, averaging, and hierarchically organizing facial surfaces. This framework is applied to the 3D face recognition problem, where facial expressions, pose variations, and occlusions by external objects are the main challenges. The facial surfaces are represented by collections of level curves and radial curves. The set of closed curves (level curves) constitutes an infinite-dimensional nonlinear submanifold and is used to represent the nasal region, the most stable part of the face. The facial surface as a whole is represented by an indexed collection of radial curves. In this case, the calculus is simpler and the shape space of open curves is simply a hypersphere in a Hilbert space. The comparison in this shape space is done via an "elastic" metric in order to handle non-isometric (not length-preserving) deformations of facial surfaces. We propose algorithms for computing means and eigenvectors in these nonlinear manifolds, and hence algorithms for estimating missing parts of 3D facial surfaces. Comparison with competing approaches using a common experimental setting on the FRGCv2, GAVAB, and BOSPHORUS databases shows that our solution matches, and in some scenarios outperforms, state-of-the-art results.